SUMMARY


Most approaches to weather and cloud forecasting entail the use of a large numerical weather
prediction code. These codes assimilate many forms of current weather data and then propagate
that weather into the future using the governing dynamic equations. PSR has developed a cloud
forecast model based on a neural network (NN) intended for eventual (by FY2002) integration
into the Cloud Depiction and Forecast System II (CDFS II) at Air Force Global Weather Central.
Work to date indicates that cloud analysis and forecasting can be accomplished on a
single workstation with a satellite data feed within a theater of operations, potentially with no
connection to a weather center.

The  cloud  forecast  model  is  founded  on  established  forecast  principles  combined  with  promising 
new  analysis  techniques.  The  model  is  based  on  three  fundamental  processes:  persistence, 
advection  and  evolution.  These  processes  encompass  the  full  range  of  atmospheric  time  and 
length  scales  that  influence  short-  and  extended-range  cloud  forecasts.  Current  meteorological 
methods are used to define the basic elements of each process, but a NN is employed to analyze
and  combine  the  elements. 

The current PSR worldwide cloud prediction model (WCPM) is based upon a pixel-by-pixel
implementation of a NN. The advection of clouds within a pixel is traced through time. The
temporal  evolution  of  a  pixel  is  estimated  from  past  data.  The  persistence  of  cloud  properties  at 
a  pixel  is  estimated  from  past  data.  The  forecast  is  based  upon  this  pixel  level  analysis  and  is 
almost  independent  of  changes  in  neighboring  pixels. 

The  NN  was  trained  on  SERCAA  cloud  images  from  both  the  Mediterranean  Sea  and  the 
Central/South America region. Forecasts up to 12 hours were calculated and compared to truth.
The approach demonstrated the ability to predict both the advection and evolution of clouds.

The  approach  has  considerable  difficulty  forecasting  scattered  clouds.  This  is  not  surprising 
given  that  the  scattered  clouds  are  completely  random  in  both  location  and  evolution.  The  best 
results were therefore obtained by training on and predicting median filtered clouds. RMS
prediction errors of ~20% were typical for the WCPM as compared to RMS errors of ~30% for
tropical High-Resolution Cloud Prognosis (HRCP) predictions. The dominant errors arise from
over-predicting scattered clouds.

The scattered clouds were also found to adversely affect forecast sharpness by smearing the
forecast.

The limited amount of data available for training severely impacted the ability of the NN to
forecast. A NN should be trained on a variety of situations. Our data is dominated by evolved
cloud fields with scattered clouds. Advection is therefore not well represented in the data,
nor are days with more than 25% cloud fraction. The NN naturally performs best on the
situations that are best represented in its training data. This can be remedied only by further
training. Further training will not improve the forecast of random clouds, however.

It  is  felt  that  both  of  these  problems  can  be  repaired  by  modifying  the  pixel-by-pixel  forecast 
approach.  An  object-oriented  approach  was  described  that  better  represents  the  regional  weather 
situation.  Pixels  are  no  longer  independent  of  their  surroundings  but  depend  upon  the  nature  of 
approaching (and receding) weather conditions. The object-oriented approach addresses the
scattered  cloud  problem  by  including  randomness  as  part  of  object  descriptions. 
SECTION 1
INTRODUCTION 


Most approaches to weather and cloud forecasting entail the use of a large numerical weather
prediction code. These codes assimilate many forms of current weather data and then propagate
that weather into the future using the governing dynamic equations. An alternative approach to
cloud forecasting is developed here. The new cloud forecast model is based on a neural network
(NN) intended for eventual (in FY2002) integration into the Cloud Depiction and Forecast
System II (CDFS II) at Air Force Global Weather Central.

The  cloud  forecast  model  is  founded  on  established  forecast  principles  combined  with  promising 
new  analysis  techniques.  The  model  is  based  on  three  fundamental  processes:  persistence, 
advection  and  evolution.  These  processes  encompass  the  full  range  of  atmospheric  time  and 
length  scales  that  influence  short-  and  extended-range  cloud  forecasts.  Current  meteorological 
methods are used to define the basic elements of each process, but a NN is employed to analyze
and combine the elements. While NNs have only recently been applied to meteorological problems
(McCann, 1992; Blankert, 1993; Welch et al., 1992), they are widely used for problems
such  as  pattern  recognition  where  formal  analysis  is  often  ineffective.  NNs  are  attractive  for 
cloud  forecasting  because  they  (1)  are  robust  in  the  face  of  incomplete  data,  (2)  readily  accept 
data  from  widely  divergent  sources,  (3)  are  fast  once  trained,  and  (4)  can  model  nonlinear 
relationships. 

The  worldwide  cloud  prediction  model  (WCPM)  is  based  upon  a  pixel-by-pixel  implementation 
of  a  NN.  The  advection  of  clouds  within  a  pixel  is  traced  through  time.  The  temporal  evolution 
of  a  pixel  is  estimated  from  past  satellite  and  numerical  weather  prediction  data.  The  persistence 
of  cloud  properties  at  a  pixel  is  estimated  from  past  data.  The  forecast  is  based  upon  this  pixel 
level  analysis  and  is  almost  independent  of  changes  in  neighboring  pixels. 

This report documents the three stages of project development. First, individual modules were
developed to forecast clouds using only advection, persistence, or evolution. A separate NN was
developed for each module. Second, a rigorous analysis was performed on the selected predictors
in an effort to reduce redundancy and eliminate noisy or useless predictors. A unified NN
was then developed that simultaneously utilized the advection, persistence and evolution predictors.
Lastly,  the  quality  of  the  resulting  forecasts  was  evaluated  using  skill  scores  on  data  not  used  for 
NN  training. 

The cloud forecast model developed can easily run in a standard workstation environment.
Over the limited data available for verification, the NN developed was found to perform somewhat
better in tropical regions than the current High-Resolution Cloud Prognosis (HRCP) model.
Lack of data prevented testing on a global basis.

Performance was best in regions of significant cloud cover. Poor forecast sharpness was associated
with regions of scattered clouds. The NN smeared the scattered clouds uniformly across the
region.  This  was  due  both  to  the  limited  training  set  and  to  the  pixel-by-pixel  approach.  Based 
upon  the  forecast  capabilities  of  WCPM,  an  alternative,  fully  object-oriented  approach  to  the 
NN  is  outlined  to  improve  the  performance  and  forecast  sharpness  in  regions  of  scattered  clouds. 
SECTION 2

THE  NATURE  OF  CLOUD  FORECASTING 


All forecasting problems start with a conceptual model of the phenomenon to be forecast,
in our case clouds, which must in some way conform to the way the clouds are sampled.
Our basic requirement is to forecast cloud fraction and altitude, including at least four cloud
layers (if present), on a global (or regional) basis. Several conceptual models can describe
the same cloud distribution yet result in different forecast model parameterizations.

Models differ in terms of the parameterization of the clouds themselves, and in the
parameterization of the cloud distributions.

The primary source of cloud data for the following development and analysis is SERCAA Level
3 or 4 cloud data (Gustafson et al., 1994). The SERCAA cloud data is parameterized in terms
of cloud fraction (in four layers) and (eventually) cloud type. A complete nephanalysis is
performed on high resolution imagery (2 to 5 km pixel resolution) to estimate cloud fraction at a
1/16th mesh scale (approximately 25 x 25 km). A minimum requirement for computer forecasting
is pixel-by-pixel cloud fraction; cloud type designators are useless. Neither descriptor
provides cloud image information (information that relates clouds at one pixel to clouds at
neighboring pixels). Neither descriptor provides information about the randomness in the cloud
scene.

A  typical  cloud  scene  (Figure  2-1)  can  best  be  described  as  several  spatially  and/or  temporally 
correlated  features  with  both  mean  and  random  cloud  properties.  The  mean  properties  (shape, 
area,  average  cloud  cover)  of  features  A,  B,  C  and  D  are  potentially  predictable;  the  random 
properties  (the  pixel-by-pixel  cloud  fractions)  are  not  predictable,  although  their  statistics  might 
well be. Here feature A has a mean cloud fraction (23%) and cloud top temperature (300 K).
The  random  cloud  distribution  within  the  feature  might  be  described  by  higher  moments  or  a 
fractal  dimension,  but  is  only  predictable  in  a  statistical  sense. 

Feature  D  is  an  example  of  a  relatively  stable  cloud  distribution  with  very  low  cloud  fraction. 
Only  isolated  clouds  (a  few  pixels  each)  are  present  that  appear  to  jump  around  from  temporal 
image  to  temporal  image.  Here,  either  the  clouds  are  evolving  and/or  advecting  much  more 
rapidly  than  the  sampling  time  so  a  consistent  picture  of  the  detailed  clouds  is  not  available.  The 
clouds  appear  to  be  randomly  distributed  throughout  the  feature  although  the  total  cloud  fraction 
is  relatively  constant. 
Figure 2-1. Cloud scene from the Mediterranean Sea and North Africa showing several cloud
features (A, B, C, D). North is approximately to the left on the image.


The  remaining  clouds  not  included  in  the  four  features  are  random  with  temporal  sampling.  It 
might  be  possible  to  predict  a  mean  “background”  cloud  fraction  but  not  its  spatial  distribution. 

The above discussion of predictable and random clouds is highly contingent upon the data
resolution and sampling time. A much higher sampling rate (than one image per hour) would allow
smaller clouds to be better tracked. Higher resolution (than the 1/16th mesh SERCAA data)
would allow smaller significant features to be defined. At some resolution, however, clouds are
randomly created and destroyed. The conditions under which clouds form, grow or diminish are
predictable; however, clouds are known to form randomly in space within those conditions.
Therefore, the detailed cloud distribution within regions of light clouds will always be a random
process and not predictable.

Two parameterizations of cloud distributions can be considered. The most straightforward
conceptual model is the generalized model. The clouds are located in space by a pixel designator
(i,j)  and  the  cloud  top  height  (or  IR  brightness  temperature).  Each  pixel  can  conceivably  have 
clouds  at  several  heights.  Each  pixel  is  in  many  ways  independent  of  adjacent  pixels.  Minimal 
nephanalysis  or  pre-processing  of  the  cloud  images  is  required. 

An  alternative  conceptual  model  is  a  layered  model.  Here,  the  clouds  are  pre-processed  into  L 
finite  thickness  layers.  The  layers  can  be  of  either  fixed  or  floating  height  and  thickness.  Each 
layer  is  independent  of  the  other  layers.  The  layering  imposes  an  implicit  physical  relationship 
between  clouds  in  the  same  layer.  Complete  nephanalysis  and  considerable  pre-processing  of 
the  cloud  images  are  required. 

The former parameterization is the more general and provides the more exact representation of
the available cloud data. The data is fully represented in terms of cloud fractions, pixel location,
and cloud top heights. In doing so, however, it stresses the forecast algorithms because more
parameters must be predicted. Generality is lost if layering is imposed, but the forecast
algorithms need not forecast cloud top height at independent pixels. This is a great advantage in that
clouds at many adjacent pixels are usually related in origin through (spatially) smoothly varying
atmospheric conditions (weather systems or fronts).

The generalized model is preferred because it de-emphasizes the nephanalysis and will be
investigated first. The layered model may be pursued at a later time. The final choice will be based
upon the model robustness to various cloud scenarios and upon prediction performance.
SECTION 3

MODEL  DEVELOPMENT  AND  REFINEMENT 


A  pixel-by-pixel  NN  algorithm  is  adopted  as  the  generalized  approach  to  cloud  forecasting.  The 
approach is based upon the assumption that a forecast is possible based solely upon the past
and current clouds at a single pixel and the clouds approaching it. The pixel-by-pixel
implementation was chosen
to  minimize  and  simplify  the  data  input  into  the  neural  network.  Each  pixel  is  treated  separately 
and  is  only  loosely  connected  to  surrounding  pixels  through  the  latitude  and  longitude  inputs. 

No  formal  synoptic  weather  inputs  are  employed  in  this  approach. 

The  forecast  code  is  designed  around  a  unified  NN  (described  in  Section  3.4)  with  major 
weather  inputs  representing  advection  of  clouds,  persistence  of  clouds,  and  evolution  of  clouds 
along with several influence parameters. The general structure of the code is illustrated in
Figure 3-1. This final form is somewhat different from the original configuration that employed a
separate  NN  for  each  module  input  and  a  NN  to  combine  the  individual  forecasts.  The  latter  was 
abandoned  in  favor  of  the  unified  approach  to  reduce  the  redundancy  of  the  input  parameters. 
Figure 3-1. General structure of the code.

The weather inputs are divided into two categories: cloud observation data and meteorological
parameter input. The advection and persistence modules represent the former while the evolution
module represents the latter. For this study’s purposes, the cloud observation data is taken
from SERCAA Level 3 nephanalysis. Navy Operational Global Atmospheric Prediction
System (NOGAPS) numerical analyses and forecasts are used for the meteorological parameter
inputs. See Appendix A for a description of SERCAA and NOGAPS data.

The  model  will  be  described  below  essentially  in  the  same  way  it  was  developed,  as  individual 
algorithms  that  were  eventually  merged  into  a  single  neural  network.  It  should  be  remembered, 
however,  that  although  the  algorithms  are  separately  described,  there  was  never  any  intention 
that they would perform well as stand-alone modules.

3.1  ADVECTION  DEPENDENCE. 

The advection module has evolved significantly from its original incarnation. The stand-alone
model was based upon somewhat more than simple advection. It assumed that clouds do not
depend on location, season, or local time (the kinematic assumption) and that the clouds will
follow the same trajectory and undergo the same changes during the next 12 hours as during the
previous 12 hours. The original NN module therefore included the following components
(totaling 200 input values):

1.  Advection  of  the  current  time  clouds  to  the  forecast  time; 

2.  Current  clouds  at  hourly  increments  upwind; 

3.  Previous  clouds  at  hourly  increments  upwind. 

Only the first input was retained in the final advection inputs. The purpose of the latter two
inputs was to describe how the clouds were changing as they were advected. These inputs were
dropped  because  of  noise.  Both  cloud  time  series  were  found  to  be  white  noise  sequences.  The 
noise  originated  from  two  sources:  errors  in  the  advection  trajectory  and  in  cloud  sampling. 
Both  were  aggravated  by  the  fact  that  the  data  is  dominated  by  broken  clouds  in  both  EASA  and 
CNSA. 

A  detailed  analysis  of  the  performance  of  the  advection  estimation  algorithm  resulted  in  a  major 
change  in  the  approach.  The  previous  approach  was  purposely  simple: 

•  Wind  vectors  were  estimated  for  the  previous  hour. 

•  Forecast  time  wind  vectors  were  obtained  by  simply  multiplying  the  1  hour  vectors  by  the 
forecast  time. 

•  Clouds  were  moved  based  upon  the  vectors. 
It was hoped that the neural network would correct for poor wind estimates. Instead, it was
found that poor wind estimates (when advection actually was the primary process) degraded the
performance of the persistence and evolution inputs. Based upon this, two improvements to the
advection module were instituted. A progressive wind vector advection algorithm replaced the
simple single wind vector prediction, and a smoothing algorithm was developed for the wind
field.

The  previously  employed  advection  algorithm  was  simple  and  efficient  for  short-term  forecasts 
or wind fields with little curvature. When significant curvature exists, as occurs in flow about a
major high or low pressure system, the simple linear approach produces extremely poor results.
To rectify this, a progressive vector advection module was created.

The  clouds  at  a  mesh  point  are  advected  using  the  following  algorithm  illustrated  in  Figure  3-2: 


Figure 3-2. In cases of significant curvature in the wind field, the progressive vector method (A) retains
more accuracy than the linear extrapolation method (B).


•  The  wind  field  for  the  most  recent  hour  is  assumed  to  be  the  best  estimate  of  the  wind  field 
in  the  future. 

•  The  clouds  at  a  mesh  point  are  advected  forward  1  hour  in  time  to  a  new  mesh  point  using 
the  wind  vector  at  the  current  point. 

•  The  wind  vector  at  the  new  point  is  used  to  advect  the  clouds  forward  an  additional  1  hour  in 
time. 

•  The  previous  step  is  repeated  until  the  desired  forecast  time  is  attained. 
This procedure better retains the overall shape of the cloud formations as long as the current
wind field accurately reflects the future wind field and the clouds are predominantly advected (as
opposed to evolved).
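
The difference between the two advection schemes can be sketched in a few lines. This is only an illustration under stated assumptions, not the report's code: the wind components `u` and `v` are assumed to be given in mesh points per hour on the cloud grid, the wind is read at the nearest mesh point, trajectories are clipped at the grid edge, and the function names are hypothetical.

```python
import numpy as np

def advect_linear(i, j, u, v, hours):
    """Earlier scheme: scale the 1-hour wind vector at the start point by the forecast time."""
    return i + hours * u[i, j], j + hours * v[i, j]

def advect_progressive(i, j, u, v, hours):
    """Progressive vector scheme: move 1 hour at a time, re-reading the wind at each new point."""
    ni, nj = u.shape
    x, y = float(i), float(j)
    for _ in range(hours):
        # Nearest mesh point to the current position, clipped to the grid (an assumption).
        ii = int(np.clip(np.rint(x), 0, ni - 1))
        jj = int(np.clip(np.rint(y), 0, nj - 1))
        x += u[ii, jj]
        y += v[ii, jj]
    return x, y
```

For a uniform wind field the two functions return the same end point. For a rotating field, such as flow about a pressure center, the progressive trajectory stays much closer to the circular path than the straight-line extrapolation does.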

The correlation analysis results in an inconsistent wind field, e.g., the field is not smooth and
vectors often cross. To help alleviate (but not completely eliminate) this problem, a smoothing
process has been added to the wind field estimate. The advection data is defined on a 2D grid
with many gaps: cloudless grid points with no good advection estimate. A weighted least
squares smoother and interpolator was developed.

The input data is on a grid of dimensions $n_x \times n_y$, with grid points at positions
$x = 1, 2, \ldots, n_x$ and $y = 1, 2, \ldots, n_y$. The input data consists of three pieces of data
for each grid point: $u(x,y)$ is the x component of the advection, $v(x,y)$ is the y component, and
$w(x,y)$ is the weight. $w$ is constructed from the correlation data: for good pixels, $w$ is the
correlation value (between 0 and 1; no negative values); for bad pixels, $w$ is set to zero. For bad
pixels $u$ and $v$ should also be set to zero.

The data is fit by a set of smooth 2D basis functions. We’ll specify the basis functions later, but
for now let $n_b$ be the number of basis functions used, and let the basis functions be $B_b(x,y)$ for
$b = 1, 2, \ldots, n_b$, defined for all $x$ and $y$. The smoothed advection functions are linear
superpositions of the basis functions, with some coefficients:

$$u_{\mathrm{smooth}}(x,y) = \sum_{b=1}^{n_b} \alpha_b B_b(x,y), \qquad v_{\mathrm{smooth}}(x,y) = \sum_{b=1}^{n_b} \beta_b B_b(x,y) \tag{3.1}$$

The coefficients are determined by doing a weighted fit to the advection data. This is the standard
linear least squares fitting result, with weights. For the $u$ data, define the variance

$$\sigma_u^2 = \frac{1}{n_x n_y} \sum_{x=1}^{n_x} \sum_{y=1}^{n_y} w(x,y) \left[ u(x,y) - u_{\mathrm{smooth}}(x,y) \right]^2 \tag{3.2}$$

Make the following definitions for the scalar $UU$, the vector $BU$, and the $n_b \times n_b$ matrix $BB$:

$$UU = \frac{1}{n_x n_y} \sum_{x,y} w(x,y)\, u(x,y)^2, \qquad BU_b = \frac{1}{n_x n_y} \sum_{x,y} w(x,y)\, B_b(x,y)\, u(x,y), \qquad BB_{bb'} = \frac{1}{n_x n_y} \sum_{x,y} w(x,y)\, B_b(x,y)\, B_{b'}(x,y) \tag{3.3}$$

With these and some math, the variance is

$$\sigma_u^2 = UU - 2 \sum_{b} \alpha_b\, BU_b + \sum_{b,b'} \alpha_b\, \alpha_{b'}\, BB_{bb'} \tag{3.4}$$

Minimizing this with respect to $\alpha_b$ gives a solution in terms of the inverse of the matrix $BB$:

$$\alpha_b = \sum_{b'} BB^{-1}_{bb'}\, BU_{b'} \tag{3.5}$$

and with this the variance is

$$\sigma_u^2 = UU - \sum_{b,b'} BU_b\, BB^{-1}_{bb'}\, BU_{b'} \tag{3.6}$$

The variance is useful to calculate, because it gives us a feeling for how well we’re fitting the
data.

If the basis functions were orthogonal, so that

$$BB_{bb'} = \frac{1}{n_x n_y} \sum_{x,y} w(x,y)\, B_b(x,y)\, B_{b'}(x,y) \tag{3.7}$$

was zero for $b \neq b'$, then the matrix would be diagonal and the inversion trivial. However,
because of the arbitrary weights $w$ in the equation, it is impossible to choose orthogonal basis
functions. We will just choose simple basis functions, and have to live with the matrix
inversion.
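
A compact numerical sketch of this weighted fit follows. It is illustrative only. The choice of basis is an assumption: the text says only that simple basis functions are used, so monomials $x^p y^q$ on coordinates rescaled to $[-1, 1]$ stand in here. The function names and the $1/(n_x n_y)$ normalization are likewise assumptions. The weighted sums correspond to $UU$, $BU$ and $BB$, and the solve step to Eqs. (3.5) and (3.6).

```python
import numpy as np

def poly_basis(nx, ny, order):
    """Assumed basis: monomials x^p * y^q with p + q <= order, coordinates scaled to [-1, 1]."""
    x = np.linspace(-1.0, 1.0, nx)[:, None] * np.ones((1, ny))
    y = np.ones((nx, 1)) * np.linspace(-1.0, 1.0, ny)[None, :]
    return np.array([x**p * y**q
                     for p in range(order + 1)
                     for q in range(order + 1 - p)])      # shape (n_b, nx, ny)

def smooth_component(u, w, B):
    """Weighted least-squares fit of one wind component.

    u: (nx, ny) raw component, zero at bad pixels; w: (nx, ny) weights, zero at bad pixels;
    B: (n_b, nx, ny) basis functions. Returns the smoothed field and the residual variance.
    """
    n = u.size
    UU = np.sum(w * u * u) / n
    BU = np.einsum("xy,bxy,xy->b", w, B, u) / n
    BB = np.einsum("xy,bxy,cxy->bc", w, B, B) / n
    coef = np.linalg.solve(BB, BU)      # coefficients, Eq. (3.5), without forming the inverse
    variance = UU - BU @ coef           # residual variance, Eq. (3.6)
    return np.einsum("b,bxy->xy", coef, B), variance
```

Because the fitted field is defined at every grid point, the same call both smooths the good pixels and fills the gaps where the weight is zero. Solving the linear system directly rather than inverting $BB$ is a numerical convenience and gives the same coefficients.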

Figures 3-3 and 3-4 show an example calculation for the Mediterranean wind field. First, the
north and east components of the wind field are estimated for individual cloudy pixels (Figures
3-3a and c). These are then smoothed and interpolated to produce the wind field used for
advection (Figures 3-3b and d). The results of the advection are shown in Figure 3-4. Here, the
original (T0) clouds are advected 12 hours based upon the old and the new smoothed T0 wind
field. The results are compared to truth 12 hours later. Both approaches suffer from the fact that
the cloud motion is not dominated by advection throughout the region; the clouds over southern
Europe (to the left) are not moving but are evolving. Over northern Africa, where advection is
more dominant, the new model provides a better advection-only forecast.


Figure 3-3. Cloud advection calculation using a 4th-order fit for the EMDA. Panels (a) through (d).

Figure 3-4. Cloud advection results. Panels: T0, Truth, New Method.

3.2  PERSISTENCE  DEPENDENCE. 

Persistence is the tendency of weather to change slowly or to predictably repeat itself after some
time interval. A forecast that merely persists current weather is usually the best short-term (0 to
3 hours) predictor. Some current tropical forecast models rely solely on simple persistence and a
variation of it, diurnal persistence. Analyses by Salby et al. (1991) indicate that a better
persistence forecast might be obtained by including a more complete time history of cloud behavior.
In particular, Salby et al. noted strong regionally dependent semi-diurnal and 4-day cycles
associated with easterly waves in the tropics. A cloud history function that spans at least four days
might improve forecasts.
The dominance of persistence in the SERCAA data areas is best represented by power spectral
analysis. The power spectra were obtained by analyzing 2000 9-day hourly time series from
random locations.
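
One plausible way to form such spectra is to average the periodograms of the individual time series. The sketch below assumes that approach, since the text does not state the estimator. The function name and the mean-removal step are assumptions as well.

```python
import numpy as np

def mean_power_spectrum(series):
    """Average periodogram of many hourly time series.

    series: array of shape (n_locations, n_hours), one row per location.
    Returns (frequency in cycles per day, mean power at each frequency).
    """
    x = series - series.mean(axis=1, keepdims=True)       # remove each location's mean
    power = np.abs(np.fft.rfft(x, axis=1)) ** 2 / x.shape[1]
    freq = np.fft.rfftfreq(x.shape[1], d=1.0 / 24.0)       # sample spacing of 1 hour, in days
    return freq, power.mean(axis=0)
```

With 9-day hourly series (216 samples) the frequency resolution is 1/9 cycle per day, so a diurnal cycle falls exactly on a spectral bin, while a 4-day cycle falls between bins and is only coarsely resolved.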

The  results  of  the  spectral  analysis  for  EASA  March  1993  tropical  and  mid-latitude  ocean  and 
land  are  shown  in  Figures  3-5  through  3-8.  For  cloud  fraction,  the  top  line  corresponds  to  total 
cloud fraction and the lines below are for SERCAA layers 1 through 4, respectively. For cloud
height,  starting  at  the  top,  the  lines  correspond  to  SERCAA  layers  1  through  4,  respectively. 

As  expected,  the  data  show  a  definite  diurnal  cycle  over  tropical  land  areas.  No  trends  of  any 
sort are apparent over ocean areas or at temperate latitudes. In fact, with the exception of the
diurnal peaks, the spectra are representative of a white noise process with a very long term trend
superimposed.  The  results  for  layers  3  and  4  represent  pure  white  noise  processes.  These  results 
do  not  preclude  the  presence  of  long  period  cycles  but  more  likely  reflect  poor  resolution  of  the 
lower  cloud  layers  by  the  SERCAA  nephanalysis. 

The above results require a significant modification to the anticipated persistence modeling
approach. The proposed approach called for an auto-regressive model using a 6-day time series
to capture the easterly wave 4-day cycle. The data clearly does not support such a model.
Limited data also precludes model dependence upon geographic region and time of year. Given
these constraints, a simpler approach to a persistence model was adopted that includes only a
12 hour cloud history and an average diurnal input.

The 12 hour cloud history is simply input by including the current time cloud characterization
along with cloud characterization for 1, 3, 6, and 12 hours past. This data is meant to establish
the near-time trend in cloud parameters.

The diurnal cycle in cloud parameters is input by averaging the cloud parameters from 24, 48,
and 72 hours before the forecast time. This approach appears, and is, simple but was chosen for
its robustness. The diurnal input can be averaged in several different ways and still be input. A
recursive filter with a three day weight is an obvious choice for an operational system. The
choice of weighting should be based upon information available on longer term weather
trends. A semi-diurnal or 4 day cycle can be input instead of the diurnal input.
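
The two persistence inputs can be assembled as sketched below. This is a minimal illustration for a single cloud parameter at a single pixel, assuming an hourly series indexed by hour. The function names, the bounds checks, and the exponential form of the recursive filter are assumptions rather than details taken from the report.

```python
import numpy as np

HISTORY_LAGS = (0, 1, 3, 6, 12)     # hours before the current time t0
DIURNAL_LAGS = (24, 48, 72)         # hours before the forecast time

def persistence_inputs(series, t0, lead):
    """series: hourly values at one pixel; t0: index of the current hour; lead: forecast hours."""
    tf = t0 + lead
    if lead > min(DIURNAL_LAGS):
        raise ValueError("diurnal samples would lie in the future")
    if tf - max(DIURNAL_LAGS) < 0 or t0 - max(HISTORY_LAGS) < 0:
        raise ValueError("not enough past data")
    history = [float(series[t0 - lag]) for lag in HISTORY_LAGS]
    diurnal = float(np.mean([series[tf - lag] for lag in DIURNAL_LAGS]))
    return history, diurnal

def recursive_diurnal(prev_avg, new_value, days=3.0):
    """Assumed recursive (exponential) average with a nominal memory of `days` days."""
    alpha = 1.0 / days
    return (1.0 - alpha) * prev_avg + alpha * new_value
```

For a 12 hour forecast the most recent diurnal sample is the observation 12 hours before the current time, so every diurnal sample is already in hand when the forecast is made. The recursive form needs only the previous average and one new value, which is why it suits an operational system.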
Figure 3-8. Spectral analysis of cloud history in EASA March 1993 over mid-latitude land regions.


Table 3-1 summarizes both the minimal and normal data requirements for the persistence
algorithm. The minimum requirements refer to data requirements necessary for a cold start.
Therefore the model can be started with only the previous day’s data. Normal operation requires three
previous days of data.


Table 3-1. Persistence model data requirements.

Minimum Requirements      Normal Requirements
--------------------      ---------------------------
t0                        t0
t0 - 1 (hours)            t0 - 1 (hours)
t0 - 3                    t0 - 3
t0 - 6                    t0 - 6
t0 - 12                   t0 - 12
tforecast - 24            tforecast - av(24, 48, 72)

Three quantities are input for each of the times (except diurnal) in Table 3-1. For each identified
layer of clouds these include: (1) time delay from t0; (2) cloud fraction at the time delay; (3)
cloud top temperature at the time delay.

3.3  EVOLUTION  DEPENDENCE. 

Like  persistence,  the  evolution  algorithm  depends  on  local  characteristics  such  as  topography, 
geography,  latitude  and  time-of-day,  but  whereas  the  persistence  and  advection  algorithms 
merely extrapolate cloud behavior in time and space, the evolution algorithm exploits atmospheric
dynamics to predict clouds by engaging the output of a numerical weather prediction
(NWP)  model.  Since  the  military  intends  to  consolidate  all  NWP  functions  under  FNMOC,  and 
since  NOGAPS  is  the  Navy's  global  forecast  model,  it  is  likely  that  NOGAPS  data  will  be  the 
source  of  NWP  data  in  future  AF  cloud  forecast  systems.  Therefore,  the  decision  was  made  to 
rely  exclusively  on  NOGAPS  as  the  source  for  NWP  data. 

Since  NWP  models  generally  do  not  predict  clouds  directly,  it  is  necessary  to  relate  the  model 
output  data  to  the  cloud  fields.  The  standard  procedure  for  doing  this  is  termed  Model  Output 
Statistics  (MOS).  The  first  step  in  the  MOS  approach  is  to  define  a  set  of  predictors  based  on 
NWP  forecast  data.  Predictors  are  not  limited  to  NWP  data  and  may  include,  for  example,  the 
current  observed  cloud  fields.  The  predictors  are  then  related  to  the  forecast  clouds 
(predictands) by means of a regression analysis on historical data. Our approach is similar
except that we use a neural network to relate predictors to predictands. The advantage of the
neural network approach is that possible nonlinear and cross-product relationships between
predictors are automatically ferreted out by the neural network to produce a better estimate of
the predictand. The predictors are drawn from a pool of potential predictors that includes
elemental and derived variables based on NOGAPS data.
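
For comparison, the regression step of a conventional MOS scheme can be sketched as an ordinary least-squares fit. This illustrates the standard procedure that the neural network replaces, not the method used in this work. The function names and array layout are hypothetical.

```python
import numpy as np

def fit_mos(predictors, predictand):
    """Least-squares MOS fit. predictors: (n_samples, n_predictors); predictand: (n_samples,)."""
    X = np.column_stack([np.ones(len(predictors)), predictors])   # intercept column first
    coef, *_ = np.linalg.lstsq(X, predictand, rcond=None)
    return coef

def apply_mos(coef, predictors):
    """Apply fitted coefficients to new predictor values."""
    X = np.column_stack([np.ones(len(predictors)), predictors])
    return X @ coef
```

A linear fit of this kind captures a cross-product effect only if the product is supplied explicitly as an extra predictor column, which is the limitation the neural network is meant to remove.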

There is a large disparity in the resolutions of predictors based on NOGAPS data and
predictands based on SERCAA data. NOGAPS provides a global analysis and a 12-hour forecast
twice daily at 00 and 12 Z on a 2.5 x 2.5 degree latitude/longitude grid. The resolution
(east-west grid spacing) at 60° N is 139 km, coarsening to 278 km at the equator. In contrast,
SERCAA data is available hourly (nominally) and the resolution of 16th-mesh SERCAA data at
60° N is 23.8 km, increasing toward the equator. The current NOGAPS operational model is higher
resolution (0.75 x 0.75 degree) but unfortunately no archived data is available for the 1993
and 1994 times corresponding to the SERCAA data sets. Figure 3-9 is an example of a NOGAPS
analysis of mean sea level (MSL) pressure. The EASA region is outlined at NOGAPS resolution.
(Note, there is an error in NOGAPS depiction of the century. It should show 1993 rather than
1893.)
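
The 139 km and 278 km figures quoted above for the 2.5 degree NOGAPS grid can be checked with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6371 km; the function name is illustrative and does not come from the report.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def zonal_spacing_km(step_deg, lat_deg):
    """East-west distance spanned by step_deg of longitude at latitude lat_deg."""
    km_per_degree = 2.0 * math.pi * EARTH_RADIUS_KM / 360.0
    return step_deg * km_per_degree * math.cos(math.radians(lat_deg))


print(round(zonal_spacing_km(2.5, 0.0)))   # 278 (equator)
print(round(zonal_spacing_km(2.5, 60.0)))  # 139 (60 N)
```

The spacing halves between the equator and 60° N because the cosine of 60 degrees is one half.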

Table 3-2 shows the variables considered in the search for cloud field predictors. The first 6
variables are elemental NOGAPS model output data. The remaining variables, beginning with
divergence, are derived from the elemental variables. The height variable refers to the height
of the pressure (hPa) surface. All variables, other than MSL pressure and surface (SFC)
temperature, are defined on the pressure surfaces listed across the top of the table. Vapor
pressure (and thus relative humidity) is available only to 300 hPa. Divergence and vorticity
are associated with vertical motion in the atmosphere at mid- to upper-latitudes and are
therefore likely to be correlated with clouds. Relative humidity is obviously linked with
cloudiness. Temperature advection, vorticity advection, wind speed, and wind shear are often
associated with developing storm systems. Temperature difference and thickness between pressure
surfaces are measures of atmospheric stability.

Each predictor listed in Table 3-2 is used in three different ways. First, we simply take the
predictor defined by the 12-hour forecast as it stands. Second, we subtract the zonal average
from the 12-hour forecast value. Last, we define a trend based on the predictor at forecast
time and its 12-hour forecast value. All calculations are performed on the NOGAPS 2.5 x 2.5
degree grid and interpolated to the SERCAA 16th-mesh grid. Predictors are only compared to
total cloud fraction and no attempt is made to discriminate predictors as a function of cloud
layer, height, geography, or latitude zone. The 3 forms of 15 predictors at 17 heights result
in a pool of 618 potential predictors (not all variables are available at all heights).

Figure 3-9. NOGAPS analysis of mean sea level pressure.

Table 3-2. Evolution module predictors.*
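
The three forms in which each predictor is used can be sketched as follows. This is only an illustration under stated assumptions: the zonal average is taken as the mean along each latitude row of the grid, the trend is taken as a simple difference between the 12-hour forecast and the value at forecast time, and the function name is hypothetical.

```python
def predictor_forms(analysis, forecast):
    """Return the raw, zonal-perturbation and trend forms of one variable.

    analysis, forecast: 2-D lists indexed [latitude_row][longitude_column],
    holding the value at forecast time and the 12-hour forecast value.
    """
    # Form 1: the 12-hour forecast as it stands.
    raw = [row[:] for row in forecast]

    # Form 2: subtract the mean along each latitude row (assumed zonal average).
    perturbation = []
    for row in forecast:
        zonal_mean = sum(row) / len(row)
        perturbation.append([value - zonal_mean for value in row])

    # Form 3: change between forecast time and the 12-hour forecast
    # (assumed to be a simple difference).
    trend = [
        [f - a for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(forecast, analysis)
    ]
    return raw, perturbation, trend


# Tiny made-up relative humidity field (percent), 2 latitudes x 3 longitudes.
analysis = [[50, 60, 70], [20, 20, 20]]
forecast = [[60, 60, 90], [10, 20, 30]]
raw, perturbation, trend = predictor_forms(analysis, forecast)
```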

The next step is to identify the predictors that show the highest degree of association with
the predictands. Several measures of association were considered. One approach is to compare
the chi-square, entropy, and Spearman rank correlation values calculated from a contingency
table of predictor versus predictand. The analyses based on contingency tables all produced
similar results. For example, if the chi-square value was high, then so were the other measures
of association. We also performed a matrix correlation between predictor and predictand. The
best correlated predictors produced by this analysis differed significantly from those ranked
high based on the contingency table. Visual comparisons of predictor and predictand in both
cases led us to choose correlation as the best measure of association.

The correlation between predictor and predictand was then calculated for all times in each data
set. The absolute values of correlation were averaged and ranked. Predictors that were related
were eliminated to reduce redundancy. For example, if vapor pressure and relative humidity at a
given height were both found to be highly correlated with total cloud fraction, then only the
higher ranked predictor was kept. Similarly, only the higher ranked zonal wind or total wind
speed was kept, since the zonal wind vector usually accounts for most of the wind speed
magnitude. Also, only the higher ranked fundamental variable or its zonal perturbation was
kept, not both. Table 3-3 shows the 25 top-ranked predictors for the March and July EASA data
sets. Figures 3-10 and 3-11 illustrate the close correlation between total cloud fraction
observed at 00 Z on 29 July 1993 and the relative humidity based on a 12-hour forecast made at
12 Z on 28 July 1993 for EASA.
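
The ranking step described above (correlate at each time, average the absolute values, sort) can be sketched as below. The sketch assumes a Pearson correlation over co-located grid points and omits the redundancy elimination; the function names and the sample numbers are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_predictors(predictor_fields, cloud_fields):
    """Rank predictors by mean absolute correlation with total cloud fraction.

    predictor_fields: {name: [field_at_time_0, field_at_time_1, ...]}
    cloud_fields:     [cloud_at_time_0, cloud_at_time_1, ...]
    Every field is a flat list of values at the same grid points.
    """
    scores = {}
    for name, fields in predictor_fields.items():
        per_time = [abs(pearson(f, c)) for f, c in zip(fields, cloud_fields)]
        scores[name] = sum(per_time) / len(per_time)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up data: two times, four grid points each.
cloud = [[0.1, 0.5, 0.9, 0.3], [0.2, 0.4, 0.8, 0.6]]
predictors = {
    "rh_850": [[10, 50, 90, 30], [20, 40, 80, 60]],
    "noise": [[1, -1, 1, -1], [-1, 1, 1, -1]],
}
ranking = rank_predictors(predictors, cloud)
```

Taking the absolute value before averaging means a strongly anti-correlated predictor ranks as high as a strongly correlated one, which matches the description in the text.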

Table 3-3. 25 top-ranked predictors for EASA data sets.

Once the best predictors were identified, a set of vectors was generated for neural network
training. Each training vector contains 37 input and 16 output elements. The input elements
consist of predictors (25), current cloud fraction fields (4), elevation (1), time-of-day (2),
latitude (1), longitude (2) and terrain slope (2). The output elements are 4 cloud fraction
fields at 3, 6, 9, and 12 hours (16). The 25 top-ranked predictors were first calculated on the
2.5 x 2.5 degree NOGAPS grid and then interpolated to the 16th-mesh SERCAA grid. Predictors were
selected from 500 random locations within the region for each time in the data set. The times
used for training are determined by the NWP forecast cycle. Only times where NWP data is
available at the forecast time (Figure 3-12a) are used. The model has not been tested for times
where NWP data is not synchronized with the forecast (Figure 3-12b). The last 12-hour period
in the data set encompassing a NWP forecast cycle is reserved for validation. There are
typically 15 times in each data set, excluding the last 12-hour period, where NWP data is
synchronized with the forecast time. As a result, the training set for each data set consists
of about 7500 (500 x 15) training vectors.

Figure 3-10. SERCAA total cloud fraction observed at 00 Z on 29 July 1993.

Figure 3-11. NOGAPS relative humidity 12-hour forecast for Figure 3-9.

Figure 3-12. Evolution data feed: (a) forecast cycle tested in the current model configuration,
(b) example of another forecast cycle the model must eventually handle.

Figures 3-13 and 3-14 show 3, 6, 9, and 12-hour forecasts produced by the evolution algorithm
alone for EASA. The forecasts are produced for the last 12-hour period in each data set. The
neural network weight/bias sets were not trained on this data, so these forecasts are an
indication of how well the neural network performs on new data. Forecast times are 00 Z on 30
March 1993 (Julian day 89) and 31 July 1993 (Julian day 212) for EASA. Images are displayed in
the 16th-mesh coordinate system so each panel is oriented with the equator near the top and
60° N near the lower right corner. SERCAA data was not provided for southern latitudes and this
accounts for the arc near the top of each panel defined by the absence of clouds. Wedges of
discontinuous cloud fraction in the observed clouds are an artifact of the SERCAA merge
process. The wedges are the result of improper geometric correction or calibration of multiple
satellites before merge. Discontinuities in the forecast images are visible where there is a
transition between weight/bias set predictions for a specific latitude zone and geography. An
example is the transition between latitude zones at 25° N across the center of most forecast
images.

Figure 3-13. Evolution module forecast for EASA for day 89: (a) total cloud fraction, and (b) layer 1 cloud fraction.

Figure 3-14. Evolution module forecast for EASA for day 212: (a) total cloud fraction, and (b) layer 1 cloud fraction.

These results are preliminary and are only indicative of the performance of the evolution
algorithm. Evolution inputs will be further refined when the three algorithms are combined.
There are some encouraging features from the evolution-only algorithm. Figure 3-13 indicates
reasonable cloud development over land in the tropics. The light area visible near the top and
center-right of the forecast panels in Figure 3-13 shows cloud development coincident with
Borneo. The outline of Borneo is apparent in the observed clouds (top panels), but not as
sharply defined. Similarly, cloud development is predicted over the Philippines but not the
surrounding ocean. At mid-latitudes (lower half of Figures 3-13 and 3-14), the general pattern
of the observed cloud field is predicted but features are not as well defined as in the tropics.

3.4  COMBINED  NEURAL  NETWORK. 

Combining  the  individual  algorithms  consisted  of  two  interactive  parts.  First,  the  general  form 
of  the  NN  was  established.  Second,  the  final  selection  of  the  input  vectors  was  made  based  upon 
NN  prediction  performance.  Both  are  discussed  in  the  following  sections. 

The fully-connected, feed-forward, back-propagation NN shown in Figure 3-15 was adopted for
use on this project. The NN has 28 input nodes (the final number of inputs), two hidden layers
(12 and 10 nodes, respectively) and three output nodes for a total of 430 degrees of freedom.
Several other variations on the number of hidden layers and the number of nodes in the hidden
layers were attempted. This was by no means an exhaustive study but several trends pointed
toward the current selection. Greatly increasing the number of nodes in the hidden layers
significantly improved the training error but not the prediction error. A single hidden layer
performed more poorly. Reducing the number of hidden-layer nodes degraded the prediction
capability.

3.4.1  Neural  Network  Training. 

Figure 3-15. Neural network configuration.

Training takes place on a batch of input vectors selected at random from the population of
training vectors. The objective of training is to reduce the sum squared difference between the
neural network output and target cloud fields. The weight/bias set giving the least error is
sought using a line minimization approach. Line minimization attempts to quickly hunt down
the minimum of a two-dimensional curve by successively fitting parabolas to a region that
brackets the minimum. This is usually more efficient than iterative methods where the minimum
is found by taking a series of steps in the direction of greatest decreasing error (gradient
descent), particularly if the minimum lies within a broad, shallow region of the curve. The
error surface is actually multidimensional, the dimension depending on the number of weights
and biases in the network. The search for a global minimum on the multidimensional error
surface is reduced to a series of two-dimensional searches by iteratively finding the minimum
in first one direction, then another. Gradient descent moves in the direction of maximal error
reduction. We employ a more efficient search that proceeds in the so-called conjugate gradient
direction, which is a compromise between the previous search direction and that of gradient
descent. The path defined by conjugate gradient directions tends to approach the minimum
smoothly, eliminating inefficient zigzags inherent in the gradient descent approach.
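
The combination of parabolic line minimization and conjugate gradient search directions described above can be illustrated on a small test function. The sketch below is not the training code used in this work: it assumes a convex error function, fits a single parabola through three trial step lengths (0, 1 and 2) instead of refining a bracket, and uses the Polak-Ribiere formula for the direction update. All names are hypothetical.

```python
def parabolic_step(phi, a=0.0, b=1.0, c=2.0):
    """Step length at the vertex of the parabola through phi(a), phi(b), phi(c)."""
    fa, fb, fc = phi(a), phi(b), phi(c)
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if den == 0.0:
        return b  # the three points are collinear; fall back to the middle point
    return b - 0.5 * num / den


def conjugate_gradient(f, grad, x0, iterations=20, tol=1e-10):
    """Minimize f from x0 using conjugate gradient directions and parabolic line searches."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]  # the first direction is plain steepest descent
    for _ in range(iterations):
        if sum(gi * gi for gi in g) < tol:
            break
        # One-dimensional error curve along the current search direction.
        t = parabolic_step(lambda s: f([xi + s * di for xi, di in zip(x, d)]))
        x = [xi + t * di for xi, di in zip(x, d)]
        g_new = grad(x)
        # Polak-Ribiere weight: blends the previous direction with the new gradient.
        beta = sum(gn * (gn - go) for gn, go in zip(g_new, g)) / sum(go * go for go in g)
        beta = max(0.0, beta)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x


def f(p):
    # Elongated quadratic bowl with its minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + 10.0 * (p[1] + 2.0) ** 2


def grad(p):
    return [2.0 * (p[0] - 1.0), 20.0 * (p[1] + 2.0)]


solution = conjugate_gradient(f, grad, [0.0, 0.0])
```

On a quadratic bowl the parabola fit is exact along each line, so the search reaches the minimum in two line searches, whereas steepest descent would zigzag across the narrow valley.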

The NN was extensively trained on the best and longest data set, the first six days of EMDA data
(days 73 through 78). The following procedure was followed:

1. An input file was created for all descriptors of each available hourly image (some were
missing).

2. One-third of all pixels were randomly selected from the first three days of data.

3. The NN was trained for 100 iterations on this training set.

4. The process was repeated for the second three days of data but the training was started
with the previously obtained nodal weights.

The above procedure guaranteed that training included a distribution of available latitudes,
longitudes, times of day and land types. (Dividing the data into two three-day pieces was based
upon a computer limitation.) The NN was trained on a total of approximately 500,000
independent input vectors.

Training was stopped after 100 iterations in all cases. It was found that 95% of the training was
accomplished in the first 25 to 35 iterations. Little improvement in training was realized after
that point. In general, the training error varied from 15 to 20% when raw data was used as
input; a 5% improvement was realized when median filtered data was used for training (see
Section 4.2).

The greatest shortcoming of the training was a lack of variety in the cloud cover. A quick
perusal of the cloud images in Appendix A shows that the data set is best characterized by
scant cloud cover. For the most part, clouds are confined to coastal areas around Italy, Greece
and Turkey with variable clouds in North Africa. No instances of heavy clouds are recorded.
More robust training is required in the future if better prediction performance is to be
achieved.

The  same  procedure  was  followed  for  training  the  NN  on  the  CNSA  data.  As  indicated  in  Table 
A-1,  Appendix  A,  far  fewer  image  hours  of  data  were  available.  To  overcome  the  lack  of  data, 
the  training  commenced  using  the  weights  obtained  from  the  EASA  training. 

3.4.2  Training  Vector  Definition. 

The final input vector definition was selected based upon an input parameter sensitivity study.
The most straightforward method of determining which input parameters are important is to
selectively omit parameters from the training process (Butler and Meredith, 1996). The removal
of a parameter can affect NN performance in three ways: 1) if the parameter is important, the
NN performance is degraded, 2) if the parameter is unimportant, the NN performance is
unchanged, and 3) if the parameter acts like a noise source, the NN performance is improved.
Parameters that fall into the last category should be eliminated. Parameters that fall into the
second category should be strongly considered for removal because their inclusion increases the
training requirements and adds undesired degrees of freedom to the network.

A detailed study of all possible parameter combinations was obviously not performed. Instead,
the study focused on the persistence input, the evolution parameters and the influence parameters
(latitude, longitude, land type, elevation, etc.). Table 3-4 presents the qualitative results of the
study. Two important results emerge. First, the elevation input degrades the NN performance.
Second, individually removing any of the many evolution parameters does not affect the NN
performance; however, removing all of the evolution parameters degrades NN performance.
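
The three-way classification used in the sensitivity study can be expressed as a small helper. The sketch below compares a skill score (higher is better) with and without a given input. The tolerance of 0.02 is an arbitrary assumption, not a value from this study, and the function name is hypothetical. The sample numbers are the 2-hour G20/20 scores from Table 3-4.

```python
def classify_parameter(baseline_score, score_without, tolerance=0.02):
    """Classify an input by how a skill score changes when the input is removed."""
    change = score_without - baseline_score
    if change < -tolerance:
        return "important"    # removal degrades performance: keep the input
    if change > tolerance:
        return "noise-like"   # removal improves performance: eliminate the input
    return "unimportant"      # no meaningful change: candidate for removal


baseline = 0.62  # all inputs, 2-hour forecast
print(classify_parameter(baseline, 0.67))  # elevation removed  -> noise-like
print(classify_parameter(baseline, 0.50))  # land type removed  -> important
print(classify_parameter(baseline, 0.62))  # longitude removed  -> unimportant
```

A single score at a single forecast duration is a thin basis for a decision; a fuller treatment would combine several scores and durations before dropping an input.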

Based upon these results, the evolution parameters were re-evaluated in terms of the applicable
atmospheric physics to select a much reduced input parameter set. The primary atmospheric
condition that favors cloud formation is the uplift of warm moist air. This can be characterized
by the NOGAPS relative humidity, velocity divergence, and temperature parameters at various
altitudes. A new evolution predictor set of relative humidity, velocity divergence and
temperature at five altitudes (sea level, 100, 300, 500 and 850 hPa) was tested. Five altitudes
provided redundant information. Two altitudes (850 and 500 hPa) provided the best compromise.
Temperature was found to provide no meaningful improvement in NN performance and was
eliminated from the predictors. The final predictors are listed in Table 3.5. The basic results
reflect the most important predictors found by others. In reviewing the predictors (used and
not used) it is important to remember that these were chosen based upon NN performance with a
particular, limited set of tropical cloud data. Other scenarios might require some additions or
adjustments to these predictors. More extensive NN training might reduce the training error and
result in additional predictors becoming important.

Table 3-4. Skill scores for NN forecasts (cloud fraction).


Training Data                   Sharp Obs.  Sharp For.  Brier   ESS    G20/20

2 hour forecast
all*                            0.97        0.67        0.12    0.26   0.62
elevation removed               0.97        0.77        0.13    0.33   0.67
lat/lon removed                 0.97        0.77        0.14    0.21   0.67
longitude removed               0.97        0.70        0.13    0.32   0.62
land type removed               0.97        0.54        0.15    0.27   0.50
evol removed                    0.97        0.71        0.11    0.32   0.65
evol removed except div850      0.97        0.74        0.12    0.32   0.67
elev. evolution <500 removed    0.97        0.74        0.12    0.29   0.66
div @ 850,500 only+             0.97        0.70        0.12    0.22   0.64
rh @ 850,500 only+              0.97        0.71        0.12    0.36   0.65
tmp @ 850,500 only+             0.97        0.74        0.11    0.39   0.67
temp & div @ 850,500 only+      0.97        0.75        0.12    0.33   0.67
rh & div @ 850,500 only+        0.97        0.73        0.12    0.39   0.66
tmp & rh @ 850,500 only+        0.97        0.76        0.12    0.32   0.68
evol @ 850,500 only+            0.97        0.68        0.12    0.29   0.63

3 hour forecast
all*                            0.97        0.67        0.13    0.28   0.60
elevation removed               0.97        0.75        0.13    0.31   0.66
lat/lon removed                 0.97        0.78        0.14    0.19   0.67
longitude removed               0.97        0.68        0.13    0.32   0.61
land type removed               0.97        0.51        0.16    0.25   0.47
evol removed                    0.97        0.69        0.12    0.32   0.63
evol removed except div850      0.97        0.71        0.12    0.30   0.64
elev. evolution <500 removed    0.97        0.71        0.12    0.31   0.64
div @ 850,500 only+             0.97        0.68        0.12    0.22   0.63
rh @ 850,500 only+              0.97        0.69        0.13    0.33   0.63
tmp @ 850,500 only+             0.97        0.73        0.12    0.31   0.66
temp & div @ 850,500 only+      0.97        0.74        0.12    0.33   0.66
rh & div @ 850,500 only+        0.97        0.71        0.13    0.33   0.64
tmp & rh @ 850,500 only+        0.97        0.75        0.12    0.34   0.67
evol @ 850,500 only+            0.97        0.66        0.12    0.30   0.61

6 hour forecast
all*                            0.97        0.64        0.13    0.30
elevation removed               0.97        0.75        0.14    0.31   0.66
lat/lon removed                 0.97        0.74        0.14    0.17   0.64
longitude removed               0.97        0.68        0.14    0.29   0.60
land type removed               0.97        0.48        0.18    0.21   0.44
evol removed                    0.97        0.61        0.13    0.32   0.57
evol removed except div850      0.97        0.68        0.13    0.28   0.63
elev. evolution <500 removed    0.97        0.67        0.12    0.30   0.61
div @ 850,500 only+             0.97        0.65        0.13    0.27   0.59
rh @ 850,500 only+              0.97        0.67        0.14    0.30   0.60
tmp @ 850,500 only+             0.97        0.73        0.13    0.30   0.66
temp & div @ 850,500 only+      0.97        0.73        0.13    0.26   0.65
rh & div @ 850,500 only+        0.97        0.70        0.14    0.30   0.63
tmp & rh @ 850,500 only+        0.97        0.74        0.13    0.26   0.66

9 hour forecast
all*                            0.97        0.59        0.13    0.37   0.55
elevation removed               0.97        0.74        0.13    0.31   0.66
lat/lon removed                 0.97        0.77        0.14    0.26   0.66
longitude removed               0.97        0.72        0.14    0.29   0.62
land type removed               0.97        0.49        0.17    0.18   0.45
evol removed                    0.97        0.59        0.14    0.27   0.54
evol removed except div850      0.97        0.73        0.13    0.32   0.66
elev. evolution <500 removed    0.97        0.65        0.12    0.38   0.59
div @ 850,500 only+             0.97        0.72        0.12    0.24   0.65
rh @ 850,500 only+              0.97        0.69        0.14    0.27   0.61
tmp @ 850,500 only+             0.97        0.78        0.13    0.33   0.68
temp & div @ 850,500 only+      0.97        0.71        0.13    0.22   0.63
rh & div @ 850,500 only+        0.97        0.71        0.13    0.32   0.64
tmp & rh @ 850,500 only+        0.97        0.73        0.13    0.33   0.65
evol @ 850,500 only+            0.97        0.70        0.12    0.32   0.64

12 hour forecast
all*                            0.97        0.62        0.13    0.28   0.56
elevation removed               0.97        0.72        0.13    0.33   0.65
lat/lon removed                 0.97        0.81        0.15    0.17   0.67
longitude removed               0.97        0.75        0.14    0.27   0.64
land type removed               0.97        0.57        0.17    0.18   0.51
evol removed                    0.97        0.62        0.13    0.25   0.56
evol removed except div850      0.97        0.81        0.13    0.23   0.71
elev. evolution <500 removed    0.97        0.67        0.12    0.22   0.60
div @ 850,500 only+             0.97        0.74        0.13    0.18   0.65
rh @ 850,500 only+              0.97        0.68        0.13    0.24   0.61
tmp @ 850,500 only+             0.97        0.81        0.13    0.28   0.71
temp & div @ 850,500 only+      0.97        0.72        0.13    0.32   0.64
rh & div @ 850,500 only+        0.97        0.73        0.13    0.30   0.65
tmp & rh @ 850,500 only+        0.97        0.76        0.13    0.30   0.67
evol @ 850,500 only+            0.97        0.76        0.13    0.26   0.68

*This set has a duplicate t0 parameter included.
+These sets have elevation removed.

Table 3.5. Final Predictors


NN Predictors

UT of forecast time
Δt before forecast
Latitude
Longitude
Advected cloud fraction
Advected cloud top temperature
TCF at t0
CTT at t0
TCF at t0-1 hour
CTT at t0-1 hour
Δt from forecast
TCF at t0-3 hour
CTT at t0-3 hour
Δt from forecast
TCF at t0-6 hour
CTT at t0-6 hour
Δt from forecast
TCF at t0-12 hour
CTT at t0-12 hour
Δt from forecast
Clouds/no clouds flag
Relative humidity @ 850 hPa
Relative humidity @ 500 hPa
Velocity divergence @ 850 hPa
Velocity divergence @ 500 hPa
TCF at t0-24 hours (averaged over past 3 days)
CTT at t0-24 hours (averaged over past 3 days)
Land type


SECTION  4 

MODEL  PERFORMANCE 


Model performance is not easily quantifiable due to the variability of the realizable cloud
scenes. At this point in development and training, model performance is not robust.
Nevertheless, the model performance must be quantified. Toward this end, the next two sections
discuss the performance of the pixel-by-pixel model trained on raw images or median filtered
images.

Six forecast times not originally included in the NN training were selected for evaluation. The
times were selected to provide the largest cloud scene variability that this data set allows. The
model forecasts were then characterized by standard skill score figures-of-merit: equitable skill
score (ESS), 20/20 score, Brier's score, and scene correlation. For a discussion of each see
Appendix B.

Before reviewing the results, however, it will be beneficial to reconsider the discussion in
Section 2 dealing with the conceptual model of the clouds. It was stated that only some fraction
of the clouds were predictable while the remainder were considered to be random. To get an idea
of how this affects the skill scores, two cloud scenes separated by one hour (Figure 4-1) were
analyzed. The skill scores obtained are shown in Table 4-1. The 20/20 score indicates that 93%
of the pixels are within ±20% of each other and the Brier score indicates a small rms error;
this is what the eye sees. Even though the cloud scenes appear to be remarkably similar, the
ESS is quite low because the score reflects the detail not easily seen by eye. The correlation
score is higher than the ESS because it does not penalize for incorrect forecasts.
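
Two of the scores discussed above are simple enough to sketch directly. The definitions used here are assumptions inferred from the text, not taken from Appendix B: the Brier score is treated as the mean squared difference between forecast and observed cloud fraction, and the 20/20 score as the fraction of pixels whose forecast lies within 0.20 of the observation. The function names and sample values are illustrative.

```python
import math


def brier_score(forecast, observed):
    """Mean squared difference between forecast and observed cloud fraction (0 to 1)."""
    return sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(forecast)


def score_20_20(forecast, observed, tolerance=0.20):
    """Fraction of pixels whose forecast is within +/- tolerance of the observation."""
    # The small epsilon guards against floating-point round-off at the boundary.
    hits = sum(1 for f, o in zip(forecast, observed) if abs(f - o) <= tolerance + 1e-12)
    return hits / len(forecast)


# Five made-up pixels of cloud fraction.
forecast = [0.0, 0.1, 0.5, 0.9, 0.3]
observed = [0.0, 0.0, 0.6, 0.4, 0.3]

brier = brier_score(forecast, observed)
rms_error = math.sqrt(brier)
within_20 = score_20_20(forecast, observed)
```

In this example one badly missed pixel out of five dominates the Brier score while the 20/20 score still reports 80% agreement, which illustrates why the scores can tell different stories about the same scene.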


Figure 4-1. Consecutive hours of total cloud cover in EMDA: (a) 7909, (b) 7910.

Table 4-1. Skill scores comparing consecutive hours of total cloud cover from EMDA
shown in Figure 4-1.

Skill Score    Value
Brier          .04
ESS            .27
20/20          .93
Correlation    .41

The  same  pattern  of  scores  is  expected  when  comparing  forecasts  to  truth  in  Sections  4.1  and 
4.2.  The  overall  picture  will  be  statistically  correct  but  it  will  be  smeared  due  to  a  tendency  of 
the  NN  to  forecast  an  average  value  for  the  random  component.  The  smearing  results  in  a  very 
small  or  negative  ESS,  especially  when  the  full  resolution  of  the  data  is  analyzed.  A  better  ESS 
will  be  achieved  using  the  median  filtered  data.  Large  ESS  cannot  be  expected  when  forecasting 
with  temporally  or  spatially  aliased  images. 

4.1  PIXEL-BY-PIXEL  NN. 

The training vector sets and the training process are described in Section 3.4. Raw (unfiltered,
unsmoothed) pixel data from days 71 through 78 were input for training. Figure 4-2 is an
example of a prediction for day 79. Table 4-2 shows the resulting skill scores.

The rms error (the square root of the Brier score) ranges from 17% to 27%, generally increasing
with forecast duration. The basic problem with the forecasts is the smearing of the clouds. Not
shown in the color scale is the fact that there is a low (<10%) cloud fraction forecast over
virtually the entire area. As the length of the forecast increases, the pervasive low cloud
fraction grows.
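
The rms errors quoted above follow from the Brier scores in Table 4-2 by taking square roots. The short check below uses the two-decimal values printed in the table, so it can only reproduce the quoted range approximately: it gives about 17% at 1 hour and about 26% at 10 hours, and the small difference from the quoted 27% is presumably due to rounding of the tabulated scores.

```python
import math

# Brier scores by forecast duration in hours, as printed in Table 4-2.
brier_by_hour = {1: 0.03, 3: 0.04, 6: 0.04, 10: 0.07, 12: 0.05}

# rms error in percent cloud fraction is 100 times the square root of the Brier score.
rms_percent = {hour: 100.0 * math.sqrt(b) for hour, b in brier_by_hour.items()}

for hour, value in rms_percent.items():
    print(f"{hour:2d} h: rms error about {value:.0f}%")
```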

The low cloud fraction does not affect the 20/20 scores for short forecasts but is the primary
reason they drop at longer forecasts: the background level comes up. Also, the NN fails to
forecast high cloud fractions, a further result of the smearing.

The smearing is a direct result of randomness in the cloud field and the small size of the
training set. Clear improvements are evident when the size of the training set is doubled from
three to six days. There is no reason to expect that this trend will not continue for several
doublings in training set size.


Figure 4-2. Total cloud cover forecasts for unfiltered pixel-by-pixel EMDA data on day 79.
[Panels: 790901, 790903, 790906, 790910, 790912.]

Table 4-2. Skill scores for day 79 forecast.

               Forecast Duration (hours)
Skill Score      1      3      6     10     12
Brier           .03    .04    .04    .07    .05
ESS             .29    .11    .06    .02    .02
20/20           .91    .85    .83    .71    .77

4.2  MEDIAN  FILTERED  NN. 

As discussed in Section 2, much of the cloud field is random, not predictable. Yet, the NN
discussed in Section 4.1 was trained using data including this random cloud content. In the
best of all worlds where training data is not at a premium, the random cloud data would cause
no problem; the NN would learn to treat the random fluctuations as a source of noise. This
cannot be expected from a limited (6 days) training set, however.

The above problem is addressed by filtering the training data. A median filter was selected
because it is a nonlinear filter that maintains the sharpness of cloud boundaries better than a
simple smoothing filter. Yet, isolated (random) clouds will not pass through the filter. Figure
4-3 illustrates the effect of a median filter on a cloud image. In the figure, the original
cloud image is compared to three levels of filtering. The 7 x 7 filter removes many of the
larger cloud masses. The 3 x 3 or 5 x 5 filter removes most of the isolated clouds but maintains
the larger cloud groups. The 5 x 5 filter was selected for training purposes.
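
The behavior described above, where isolated clouds are removed while larger cloud groups survive, can be seen in a minimal median filter. The sketch below is illustrative only: it works on a 2-D list of cloud fractions and clips the window at the image border, which is an assumption, since the edge handling used in this work is not stated. The function name and test scene are made up.

```python
from statistics import median


def median_filter(image, size=5):
    """Apply a size x size median filter to a 2-D list of cloud fractions.

    Windows are clipped at the image border (an assumed edge treatment).
    """
    half = size // 2
    rows, cols = len(image), len(image[0])
    filtered = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out_row.append(median(window))
        filtered.append(out_row)
    return filtered


# 9 x 9 clear scene with one isolated cloudy pixel and one 5 x 5 cloud group.
scene = [[0.0] * 9 for _ in range(9)]
scene[1][1] = 1.0
for r in range(3, 8):
    for c in range(3, 8):
        scene[r][c] = 1.0

filtered = median_filter(scene, size=5)
```

After filtering, the isolated pixel at (1, 1) is cleared while the center of the cloud group at (5, 5) is unchanged. The corners of the group are eroded, which is the price paid for removing the isolated clouds.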

Forecasting with the median filtered images (Figures 4-4, 4-5, and Table 4-3) as input has the
expected effect. The forecasts are sharper, and far fewer large regions of sparse clouds cover
areas that are cloudless in the truth image.

Several  attributes  of  the  NN  are  evident  in  the  forecasts.  First,  there  appears  to  be  more  evolu¬ 
tion  and  persistence  in  the  clouds  than  advection.  When  advection  clearly  dominates  the  cloud 
history  as  on  day  81  (Figure  4-6),  the  NN  stiU  tries  to  evolve  the  cloud  field.  The  advected 
feature  is  apparent  in  the  forecast  as  the  region  of  highest  cloud  fraction,  but  is  obscured  by 
forecast  regional  persistence.  The  median  filtering  appears  to  have  little  impact  on  the  forecast 
Day  79  (Figure  4-4)  is  dominated  by  scattered  clouds.  As  the  forecast  time  gets  longer,  the 
ability  of  the  NN  to  forecast  the  details  greatly  diminishes  until  at  12  hours,  only  the  largest 
cloud  groups  are  forecast  These  predictable  cloud  groups  are  the  result  of  evolution  along  the 
coast  of  southern  Europe. 
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No  Filter 


3X3 


Figure  4-3.  Effects  of  median  filtering  on  the  cloud  image. 
[Figure 4-4(a) image panels: 790901, 790903, 790906, 790910, 790912; columns: t0, Prediction, Truth]

(The number code above denotes the day/time/forecast duration.)

Figure 4-4. Day 79 EMDA forecasts for a neural network trained on median filtered data (a) using
unfiltered input, and (b) using median filtered input.
[Figure 4-4(b) image panels: 790901, 790903, 790906, 790910, 790912; columns: t0, Prediction, Truth]

(The number code above denotes the day/time/forecast duration.)

Figure 4-4. Day 79 EMDA forecasts for a neural network trained on median filtered data (a) using
unfiltered input, and (b) using median filtered input (Continued).
[Figure 4-5(a) image panels: 801501, 801503, 801506, 801509, 801512; columns: t0, Prediction, Truth]

(The number code above denotes the day/time/forecast duration.)

Figure 4-5. Day 80 EMDA forecasts for a neural network trained on median filtered data (a) using
unfiltered input, and (b) using median filtered input.
[Figure 4-5(b) image panels: 801501, 801503, 801506, 801509, 801512; columns: t0, Prediction, Truth]

(The number code above denotes the day/time/forecast duration.)

Figure 4-5. Day 80 EMDA forecasts for a neural network trained on median filtered data (a) using
unfiltered input, and (b) using median filtered input (Continued).


[Figure 4-6(a) image panels: 812101, 812103, 812106, 812109, 812112]

(The number code above denotes the day/time/forecast duration.)

Figure 4-6. Day 81 EMDA forecasts for a neural network trained on median filtered data (a) using
unfiltered input, and (b) using median filtered input.
[Figure 4-6(b) image panels: 812101, 812103, 812106, 812109, 812112]

(The number code above denotes the day/time/forecast duration.)

Figure 4-6. Day 81 EMDA forecasts for a neural network trained on median filtered data (a) using
unfiltered input, and (b) using median filtered input (Continued).
Table 4-3. Skill scores for median filter training.

Forecast                            Forecast Duration (Hours)
Time      Score         1             3             6            10            12
                  UI(1)  MF(2)  UI(1)  MF(2)  UI(1)  MF(2)  UI(1)  MF(2)  UI(1)  MF(2)
Day 7909  Brier     .02    .01    .03    .02    .04    .02    .04    .02    .05    .02
          ESS       .30    .18    .03    .02    .50    .02    .04    .05    .03    .05
          20/20     .93    .97    .89    .95    .88    .90    .88    .91    .87    .90
Day 8015  Brier     .02    .00    .03    .01    .02    .01    .05    .04    .05    .05
          ESS       .13    .00    .00   -.02   -.02    .00   -.04   -.02   -.04   -.02
          20/20     .95    .99    .94    .98    .94    .96    .84    .85    .84    .84
Day 8121  Brier     .04    .03    .06    .02    .04    .11    .10    .12    .08    .09
          ESS       .54    .58    .37    .02    .50    .26    .22    .23    .07    .00
          20/20     .88    .92    .69    .95    .88    .52    .54    .55    .75    .76

(1) Unfiltered input; (2) Median filtered input


A limited number of days (and hours during the day) of Central American (CNSA) SERCAA
cloud images were available for verification. The data set was too small for complete training of
a separate NN, so the EASA NN was further trained using the available CNSA data. The first
four days of CNSA data were used for training. No attempt was made to optimize the input predictors
as was previously done with EASA; the same predictors were input. The fact that the NN is
capable of forecasting in a completely different region than the training region demonstrates the
versatility of the approach.

Additional training is needed only because latitude and longitude are included as
predictors. Training on the EASA locations alone resulted in a NN that forecasts no clouds
outside of the EASA region. In the future this problem should be rectified by eliminating longitude
as a predictor and better optimizing the use of land type predictors.

The CNSA images have far more clouds than EASA. Figure A-5 (Appendix A) also shows that
the Central American clouds are almost completely dominated by evolution or persistence with
little advection evident. Figure 4-7 shows the forecasts for 2300 Z on day 84. Again note that
the truth time is constant but the time from which the forecast is made (t0) changes with the
forecast length (1, 2, 6, and 9 hours). The 12 hour forecast is omitted due to data gaps.
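
The relationship between the constant truth time and the changing start time can be sketched as follows. The sketch assumes the six-digit figure labels are laid out as two digits each for day, truth hour (Z), and forecast duration in hours, which is consistent with the 2300 Z, day 84 labels; the function names are hypothetical.

```python
def parse_forecast_code(code):
    """Split a six-digit label such as '842309' into (day, truth_hour, duration).

    Assumed layout: 2-digit day, 2-digit truth hour (Z), 2-digit duration (h).
    """
    day = int(code[0:2])
    truth_hour = int(code[2:4])
    duration = int(code[4:6])
    return day, truth_hour, duration


def start_time(code):
    """Return (day, hour) of the time from which the forecast was made."""
    day, truth_hour, duration = parse_forecast_code(code)
    total_hours = day * 24 + truth_hour - duration
    return total_hours // 24, total_hours % 24


for label in ("842303", "842306", "842309"):
    print(label, parse_forecast_code(label), start_time(label))
```

For a label such as 790912 the same arithmetic rolls the start time back into the previous day, since a 12 hour forecast verifying at 0900 Z must start at 2100 Z on day 78.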
[Figure 4-7 image panels: 842303, 842306, 842309]

(The number code above denotes the day/time/forecast duration.)

Figure 4-7. Day 84 CNSA forecasts for a neural network trained on median filtered data (a) using
unfiltered input, and (b) using median filtered input.
[Figure 4-7 (continued) image panels: 842303, 842306, 842309]

(The number code above denotes the day/time/forecast duration.)

Figure 4-7. Day 84 CNSA forecasts for a neural network trained on median filtered data (a) using
unfiltered input, and (b) using median filtered input (Continued).
It is difficult to draw many conclusions from the CNSA forecasts because the clouds do not
change much over the nine hours of the forecast. As a result the forecasts look much like the
input advection field, which looks much like the truth. The bulk of the clouds are forecast
correctly. Again, too many clouds are forecast in some places.

4.3  COMPARISON  TO  HRCP  PERFORMANCE. 

Skill score information for overlapping analysis times is not available for HRCP. However, the
monthly HRCP verification statistics report provides some insight into the relative performance
of the two models (HRCP and WCPM).

HRCP performance is documented in terms of the rms error for 3, 6 and 9 hour forecasts on a
monthly basis. No differentiation is made in terms of the local time of the forecast.
Performance is also categorized in terms of environmental zones, with the tropical zone best for our
comparison purposes.

Combining the last fourteen months of performance reports results in an average rms error for 3,
6 and 9 hour forecasts of 29%, 38% and 39% respectively. These should be compared to the
square root of the Brier scores previously discussed. Typical Brier scores from WCPM indicate
rms errors for 3, 6 and 9 hour forecasts of 17%, 20% and 25%.
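
The conversion between a Brier score and an rms error is a square root, as the following sketch shows. The function names are hypothetical, and the example values are simply the Brier scores implied by the quoted 17%, 20% and 25% rms errors, not additional results.

```python
import math


def rms_error_percent(brier_score):
    """Convert a Brier score (mean square error, 0 to 1) to an rms error in percent."""
    return 100.0 * math.sqrt(brier_score)


def brier_from_rms_percent(rms_percent):
    """Inverse conversion: rms error in percent back to a Brier score."""
    return (rms_percent / 100.0) ** 2


for rms in (17.0, 20.0, 25.0):
    brier = brier_from_rms_percent(rms)
    print(f"rms {rms:.0f}% corresponds to a Brier score of {brier:.4f}")
```

Rms errors of 17% to 25% therefore correspond to Brier scores of roughly 0.03 to 0.06.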

Although a significant improvement in forecasting ability is indicated, the comparison might be
somewhat deceptive. The WCPM results were obtained for a few realizations in one area; the
HRCP results include many realizations in all tropical areas. The WCPM results are for a 16th
mesh resolution while the HRCP results are for an 8th (or larger) mesh resolution. Rigorous
comparisons should not be made based upon these preliminary results.
SECTION 5

FORECAST  IMPROVEMENTS  — OBJECT  ORIENTED  APPROACH 


The greater success of the median filtered forecasts and the discussion in Section 2 of cloud
randomness strongly suggest a potentially superior approach to cloud forecasting: an object
oriented approach. The object oriented approach takes the concept of cloud layers one step further
and isolates cloud "masses, clusters or systems" for analysis. The analysis and forecasting then
take into account spatial correlation and relationships in the clouds and thus become more
similar to the analysis performed by a meteorologist.

The meteorological "object" information will be obtained from a cloud segmentation and
classification analysis of remote sensing multi-spectral and microwave radiometer images combined
with available NWP data (e.g., National Meteorological Center data, NOGAPS data, European
Center for Medium Range Weather Forecasting data). The segmentation and classification
analysis will either replace the nephanalysis or defer the need for a nephanalysis until after the
forecast is made. The new process is illustrated in Figure 5-1 by the structure of the NN. The
following will discuss the critical aspects of this approach.


Figure 5-1. Proposed structure of the NN for theater area cloud forecast.

The greatest change proposed in the forecast model is the inclusion of image object
advection/persistence/evolution descriptive input. The role of the meteorological objects or weather
systems in the analysis will be twofold. First, it will enable a coarser forecast capability at a
regional level (about twice the size of the theater area of interest) to be related to cloud
properties. Second, it will partially or totally take the place of a nephanalysis in that the cloud fractions
and heights will be properties of the objects.

5.1  CLOUD  SEGMENTATION. 

A variation of the NN cloud segmentation algorithm of Peak and Tag (1988) will be developed
to perform the object or weather system identification. Whereas Peak and Tag were mainly
interested in isolating contiguous cloud masses from visible satellite data and identifying cloud types,
the current need is to identify and characterize a weather system, its motion, and its evolution.

Each weather system will be characterized by its location, physical size, intensity, motion and
rate of growth. Associated cloud masses will be characterized in terms of cloud type, height,
motion, moisture content, and texture, along with the evolution of these parameters. It is the
cloud mass characterization that will form the basis of the forecast.
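
One way to picture the proposed object description is as a pair of record types, sketched below. This is only an illustration of the attributes listed above: the class names, field names, units, and the example values are all hypothetical and are not taken from the proposed model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CloudMass:
    """Hypothetical record for one cloud mass within a weather system."""
    cloud_type: str
    height_km: float                  # assumed unit
    motion_ms: Tuple[float, float]    # (east, north) components, assumed m/s
    moisture_content: float           # assumed arbitrary units
    texture: float                    # assumed scalar texture measure


@dataclass
class WeatherSystem:
    """Hypothetical record for one weather system (the forecast 'object')."""
    location: Tuple[float, float]     # (latitude, longitude) in degrees
    size_km: float
    intensity: float
    motion_ms: Tuple[float, float]
    growth_rate: float
    cloud_masses: List[CloudMass] = field(default_factory=list)


# Illustrative values only.
system = WeatherSystem(
    location=(36.0, 28.0), size_km=400.0, intensity=0.6,
    motion_ms=(8.0, 2.0), growth_rate=0.05,
)
system.cloud_masses.append(
    CloudMass("cumulus", 2.5, (8.0, 2.0), 0.3, 0.7)
)
print(len(system.cloud_masses), system.cloud_masses[0].cloud_type)
```

Tracking the evolution of these parameters would amount to keeping a time series of such records for each system.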

At this point it is anticipated that the cloud segmentation and description will be performed
using multi-spectral and microwave remote sensing as well as NWP input. Existing algorithms
use one or the other but not both to identify cloud types. Peak and Tag use a hierarchical
approach and a NN for segmentation. Their method can include many size, shape and texture
parameters as input to the NN. Formally, the approach can accommodate multi-sensor information or other
meteorological information. Peak's segmentation methodology was investigated and rejected for
use in the pixel-by-pixel worldwide cloud forecast model but is quite appropriate to an object
oriented approach.

Liu, Curry, and Sheu (1995) have classified clouds into non-traditional categories based upon
combined infrared and microwave radiometer data to predict precipitation. This approach
appears to be more appropriate to cloud and precipitation forecasting than standard cloud
classification because the classification (Table 5-1) is based upon those characteristics of the clouds
most diagnostic of precipitation. In particular, clouds are classified based upon present and past
moisture content and cloud top height. No actual segmentation was attempted. The addition of
microwave radiometer data opens the possibility to forecast precipitation of various types as well
as cloud thickness.
Table 5-1. Schematic diagram of microwave index (f) versus cloud top temperature for cloud
classification.

Cloud top   Classes (left to right along the microwave index axis)
High-top    Thin High-Top Nonprecipitating Cloud | Deep High-Top Nonprecipitating Cloud |
            Anvil With Stratiform Precipitating Cloud | Deep Convective Precipitating Cloud
Midtop      Midtop Nonprecipitating Cloud | Midtop Precipitating Cloud
Warm        Warm Nonprecipitating Cloud | Warm Precipitating Cloud

f (Microwave Index) axis labels: 0.25  0.0  0.75

(Liu, Curry, and Sheu, 1995)


5.2  CLOUD  MOISTURE. 

Moisture is a key input into the cloud forecasting model. Numerical weather
prediction (NWP) models rely on radiosonde data for moisture field initialization. Numerous
techniques were developed in the past 10 years to utilize satellite multi-spectral and microwave
radiometer data to infer vertical profiles of water vapor and precipitation in the atmosphere (e.g.,
Weng and Grody, 1994; Jones and Vonder Haar, 1990; Liu, Curry, and Sheu, 1995).
Multi-spectral data (e.g., AVHRR) is effective at estimating water vapor content in clear sky regions,
but depends on an accurate temperature profile. If free water is also present, better results are
achieved using microwave radiometer systems such as SSM/I and SSM/T-1 or /T-2 (e.g.,
Falcone et al., 1992; Butler, Meredith, and Stogryn, 1996; Butler, Meredith, and Rosenberg,
1992). Comparisons with radiosonde data are good within the operating zone of the radiosonde.
Simulations have also demonstrated the expected sensitivity to upper atmosphere relative
humidity (e.g., Butler, Meredith, and Rosenberg, 1992).

5.3  PIXEL-BY-PIXEL  DATA. 

The pixel data remains an important part of the forecast input for two reasons. First,
persistence information (e.g., diurnal or multiple day trends) can best be characterized on a
pixel-by-pixel basis. Second, the pixel-by-pixel data retains the highest resolution information.
Neither should be discarded. Advective and evolutionary information will now be relegated to
the object level. The number of pixel level model predictors will therefore be greatly reduced.
5.4  UNIVERSAL  PARAMETERS.


A NN is relatively unlimited in terms of the quantity of data that can be input. However, once
trained, it is severely limited in terms of the types of data it can accept without a complete new
training. The choice of universal input parameter definition is therefore critical and will be
one of the first tasks performed.

Raw multi-spectral and microwave radiometer data represents brightness temperatures at
different frequencies and must be interpreted in terms of frequency. The frequency dependence must
be removed before the data can be algebraically combined. The frequency dependence is removed if
physical parameters such as vapor pressure, free water content, true temperature, etc. are
estimated. Similar parameters are estimated from radiosonde data and thus can be easily included in
the analysis. Other data such as wind vectors can be directly included into the advection
analysis.
APPENDIX A
DATA


Data for three study regions were provided as the primary database for the WCPM. All three
regions (EASA, East Asia; CNSA, Central and South America; EMDA, Eastern
Mediterranean Sea) were tropical (Figure A-1). The total data set is listed in Table A-1 and
includes the SERCAA, NOGAPS, RTNEPH and terrain data.


Figure A-1. Regions for which SERCAA data was supplied.
Table A-1. WCPM database.

Type                          Description                  Size (MB)
Numerical weather prediction  NOGAPS                           1,656
                                Feb-Apr 1993 and 1994
                                May-Jul 1993 and 1994
Terrain                       TACNEPH                              2
                              SERCAA
                                Elevation                         63
                                Geography                         17
Nephanalysis                  RTNEPH                           6,660
                                Feb-Apr 1993 and 1994
                                May-Jul 1993 and 1994
                              SERCAA
                                EASA 22-30 Mar 1993           1,180*
                                EASA 22-31 Jul 1993            1,323
                                CNSA 22-31 Mar 1994              495
                                EMDA 12-21 Mar 1994              313


SERCAA data processing was still being developed during the first year of this contract. Much
of the EASA and CNSA data received at PSR was unsuitable for our purposes due to the
presence of processing artifacts. Hence, neural network training focused on the EMDA data. In
addition, Level 4 SERCAA processing (the integrated results from all available satellites) was
unusable due to poor satellite merging and equalization. WCPM processing utilized only the
GOES-6 data in the SERCAA Level 3 data set.

A.1  TERRAIN  DATA. 

Figure  A-2  shows  the  input  SERCAA  land  type  data  for  area  EMDA.  Five  land  types  were 
utilized  in  this  study. 

A.2  SERCAA  DATA. 

SERCAA  algorithms  incorporate  high-resolution  sensor  data  from  multiple  military  and  civilian 
satellites,  polar  and  geostationary,  into  a  real-time  cloud  analysis  model  and  apply  multispectral 
cloud  analysis  techniques  that  improve  the  detection  of  clouds.  The  SERCAA  algorithms 
consist of a number of processes involved in integrating cloud analyses from multiple satellite
platforms into a single cloud analysis product. The steps required to process the raw sensor data,
collected from each of the platforms, into each of the individual cloud analysis products include
total cloud algorithms for DMSP, AVHRR, GOES, METEOSAT and geostationary platforms,
cloud layer and type algorithms, and an analysis integration algorithm (see Sarkisian et al.,
1994).

Figure A-2. Location and land type map for EMDA.

SERCAA products are available with four levels of processing. Level 1 represents raw,
individual satellite data. Level 2 represents corrected satellite data. Level 3 represents
individual satellite nephanalyses. Level 4 merges the individual satellite Level 3 analyses. Final
SERCAA products are reported on a 16th mesh and include total cloud cover fraction, number of
cloud layers (4 maximum), cloud layer coverage fraction, cloud type, cloud height, and a
measure of confidence.

The primary data used for analysis is shown in Figure A-3. Although hourly images (with a few
gaps) are available, the Level 3 total cloud fraction is shown at three hour increments.

A.3  NOGAPS  DATA. 

NOGAPS "data" is provided by FNMOC. NOGAPS is the output of a large NWP code that also
assimilates satellite weather observations to produce a current prediction of the numerical
parameters previously shown in Table 3-2 and a 12 hour forecast of those parameters. A
complete description of the NOGAPS algorithms is beyond the scope of this report. The reader
is referred to reports specifically related to NOGAPS for model details. NOGAPS does not
produce a nephanalysis.

NOGAPS provides a global analysis and a 12-hour forecast twice daily at 00 and 12 Z on a 2.5 x
2.5 degree latitude/longitude grid. The east-west grid spacing at 60° N is 139 km, widening to
278 km at the equator. In contrast, SERCAA data is available hourly (nominally) and the resolution of
16th-mesh SERCAA data at 60° N is 23.8 km, increasing toward the equator. The current
NOGAPS operational model is higher resolution (0.75 x 0.75 degree) but unfortunately no
archived data is available for the 1993 and 1994 times corresponding to the SERCAA data sets.
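
The 139 km and 278 km figures follow from the geometry of a regular latitude/longitude grid, as the short check below illustrates. It assumes a spherical Earth with a mean radius of 6371 km; the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def east_west_spacing_km(grid_degrees, latitude_degrees):
    """East-west distance spanned by one grid cell at the given latitude."""
    km_per_degree = 2.0 * math.pi * EARTH_RADIUS_KM / 360.0
    return grid_degrees * km_per_degree * math.cos(math.radians(latitude_degrees))


for lat in (0.0, 60.0):
    print(f"2.5 degree grid at {lat:.0f} deg latitude: "
          f"{east_west_spacing_km(2.5, lat):.0f} km")
```

The north-south spacing of such a grid does not change with latitude and stays at the equatorial value.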

Figures  A-4  and  A-5  show  NOGAPS  data  corresponding  to  the  SERCAA  EMDA  data. 
Figure A-3. SERCAA level 3 total cloud fraction on days 71 through 81 for EMDA.
[Figure A-5 panel title: Total Cloud Fraction - CNSA]

Figure A-5. SERCAA level 3 total cloud fraction on days 80 through 91 for CNSA.
APPENDIX B

SKILL SCORE DEFINITION


The skill scores used in this study are completely defined and described in Gandin and
Murphy (1992). For completeness, a short definition of each is provided below.

Brier Score:

The Brier score is the standard measure of the mean square error between the forecast
and the truth. There is no selective weighting or penalty. The score is normalized to
range from 0 to 1 with a score of 0 being perfect.
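
A minimal sketch of this definition follows, assuming the forecast and truth images hold cloud fractions between 0 and 1 so that the mean square error already falls in the 0 to 1 range. The function name and the toy images are hypothetical.

```python
def brier_score(forecast, truth):
    """Mean square error between two equally sized 2-D lists of cloud fractions."""
    total = 0.0
    count = 0
    for f_row, t_row in zip(forecast, truth):
        for f, t in zip(f_row, t_row):
            total += (f - t) ** 2
            count += 1
    return total / count


forecast = [[0.0, 1.0], [0.5, 0.5]]
truth = [[0.0, 1.0], [0.5, 1.0]]
print(brier_score(forecast, truth))  # one pixel is off by 0.5, the rest are exact
```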

Equitable Skill Score (ESS):

The ESS is defined to emphasize an ability to accurately forecast the unusual in an
image. ESS also penalizes for poor forecasts. To accomplish this, ESS weights points
approximately inversely proportional to their fractional occurrence. Therefore, the
correct forecast of an isolated feature is very heavily weighted while the correct forecast
of the most commonly occurring values is lightly weighted. Conversely, incorrect
forecasts are negatively weighted according to their fractional occurrence. Negative ESS
is therefore possible. ESS can range from -1 to +1 with +1 being perfect.

Evaluating the ESS is very difficult because the concept of a "good" score is image
dependent. Low, but positive, ESS (0.1 to 0.2) is expected for the cloud fields shown in
Figures A-2 to A-5. The cloud fields are very broken. Isolated pixels of clouds are
observed and are randomly populated. The opportunity for forecast penalties is very
large due to these isolated pixels.

20/20 Score:

The 20/20 score simply measures the ability of the forecast to predict within ±20% of the
correct value. There is no weighting according to the chance of occurrence for that
value. Scores range between 0 and 1 with a score of 1 being perfect.

The  EMDA  data  is  dominated  by  areas  of  no  clouds  (about  70%  of  the  pixels  have  no 
clouds).  The  20/20  score  should  be  high  if  the  model  accurately  forecasts  “no  clouds”. 

In  regions  of  uniform  clouds  (no  clouds  or  all  clouds)  the  scores  are  expected  to  exceed 
0.8.  In  areas  of  scattered  clouds  lower  scores  of  0.5  to  0.6  are  expected. 
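
The 20/20 score can be sketched as the fraction of pixels whose forecast falls within the tolerance. The sketch assumes that "within ±20%" means an absolute difference of at most 0.20 in cloud fraction; the function name and toy images are hypothetical.

```python
def score_20_20(forecast, truth, tolerance=0.20):
    """Fraction of pixels where |forecast - truth| <= tolerance (assumed absolute)."""
    hits = 0
    count = 0
    for f_row, t_row in zip(forecast, truth):
        for f, t in zip(f_row, t_row):
            if abs(f - t) <= tolerance:
                hits += 1
            count += 1
    return hits / count


forecast = [[0.0, 0.9], [0.5, 0.1]]
truth = [[0.0, 1.0], [1.0, 0.0]]
print(score_20_20(forecast, truth))  # three of the four pixels are within tolerance
```

Because cloudless pixels forecast as cloudless always count as hits, an image dominated by clear areas yields a high score when the model forecasts "no clouds" correctly.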




Correlation (CORR):

This is the standard zero-lag image correlation coefficient given by

$$\mathrm{CORR} = \frac{1}{N} \sum_{i,j} x_{ij}\, y_{ij} \tag{B.1}$$

where $x_{ij}$ and $y_{ij}$ are pixels in the forecast and truth images, each containing N pixels. This
score emphasizes the correct forecasting of larger features in the images. Since cloudless
pixels have a value near zero, the cloudy pixels contribute most to the correlation.
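
A minimal sketch of the zero-lag correlation follows, under the assumption that the score is the mean of the pixel-wise products of the forecast and truth cloud fractions over the N pixels. The function name and toy images are hypothetical.

```python
def corr(forecast, truth):
    """Mean of pixel-wise products of two equally sized 2-D lists (assumed form)."""
    total = 0.0
    count = 0
    for f_row, t_row in zip(forecast, truth):
        for x, y in zip(f_row, t_row):
            total += x * y
            count += 1
    return total / count


forecast = [[1.0, 0.0], [0.0, 1.0]]
truth = [[1.0, 0.0], [0.0, 0.0]]
print(corr(forecast, truth))  # only the pixel that is cloudy in both images contributes
```

Pixels that are cloudless in either image contribute nothing to the sum, which is why the cloudy pixels dominate the score.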
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